Smart Paper Notes

Kendall transformation: a robust representation of continuous data for information theory

Miron Bartosz Kursa

Categories: Machine Learning | Machine Learning (Statistics)

2020-06-29

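The Kendall transformation converts an ordered feature into a vector of pairwise order relations between individual values. In this way, it preserves the ranking of observations and represents it in categorical form. The transformation generalizes methods that require strictly categorical input, especially in the limit of a small number of observations, where discretization becomes problematic. In particular, information-theoretic methods can be applied directly, without relying on differential entropy or any additional parameters. Moreover, by filtering the information down to that contained in the ranking, the Kendall transformation achieves better robustness at a reasonable cost: it discards sophisticated interactions that are unlikely to be estimated correctly anyway. In bivariate analysis, the Kendall transformation can be related to popular non-parametric methods, which shows the soundness of the approach. The paper also demonstrates its efficiency in multivariate problems and provides an example analysis of real-world data.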
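
As a rough illustration of the idea in this abstract, the sketch below maps a numeric sequence to pairwise order symbols and then computes a plug-in mutual information on the resulting categorical vectors. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `kendall_transform` and `mutual_information`, the choice of one symbol per unordered pair, and the `<`, `>`, `=` encoding are all hypothetical choices made here for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def kendall_transform(values):
    """Map a sequence to pairwise order symbols, one per pair (i, j) with i < j.

    Assumption: unordered pairs and the symbols '<', '>', '=' are illustrative
    choices; the paper may enumerate or encode pairs differently.
    """
    symbols = []
    for a, b in combinations(values, 2):
        if a < b:
            symbols.append("<")
        elif a > b:
            symbols.append(">")
        else:
            symbols.append("=")
    return symbols


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # sum over observed pairs of p(x, y) * log(p(x, y) / (p(x) * p(y)))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


x = [0.1, 2.5, 1.3, 4.0]
y = [10.0, 30.0, 20.0, 45.0]  # same ranking as x, different scale

kx = kendall_transform(x)
ky = kendall_transform(y)
mi = mutual_information(kx, ky)
```

Because only order relations are kept, any strictly increasing rescaling of `x` yields the same transformed vector, which is the robustness property the abstract refers to.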

Interpretable ML for Imbalanced Data

Damien A. Dablain, Colin Bellinger, Bartosz Krawczyk, David W. Aha, Nitesh V. Chawla

Categories: Machine Learning

2022-12-15

Deep learning models are being increasingly applied to imbalanced data in high-stakes fields such as medicine, autonomous driving, and intelligence analysis. Imbalanced data compounds the black-box nature of deep networks because the relationships between classes may be highly skewed and unclear. This can reduce the trust of model users and hamper the progress of developers of imbalanced learning algorithms. Existing methods that investigate imbalanced data complexity are geared toward binary classification, shallow learning models, and low-dimensional data. In addition, current eXplainable Artificial Intelligence (XAI) techniques mainly focus on converting opaque deep learning models into simpler models (e.g., decision trees) or mapping predictions for specific instances to inputs, instead of examining global data properties and complexities. Therefore, there is a need for a framework that is tailored to modern deep networks, that incorporates large, high-dimensional, multi-class datasets, and that uncovers data complexities commonly found in imbalanced data (e.g., class overlap, sub-concepts, and outlier instances). We propose a set of techniques that can be used both by deep learning model users to identify, visualize, and understand class prototypes, sub-concepts, and outlier instances, and by imbalanced learning algorithm developers to detect features and class exemplars that are key to model performance. Our framework also identifies instances that reside on the border of class decision boundaries, which can carry highly discriminative information. Unlike many existing XAI techniques, which map model decisions to gray-scale pixel locations, we use saliency through back-propagation to identify and aggregate image color bands across entire classes. Our framework is publicly available at https://github.com/dd1github/XAI_for_Imbalanced_Learning.

Climate Policy Tracker: Pipeline for automated analysis of public climate policies

Artur Żółkowski, Mateusz Krzyziński, Piotr Wilczyński, Stanisław Giziński, Emilia Wiśnios, Bartosz Pieliński, Julian Sienkiewicz, Przemysław Biecek

Categories: Natural Language Processing

2022-11-10

The number of standardized policy documents regarding climate policy and their publication frequency are increasing significantly. The documents are long and tedious to analyze manually, especially for policy experts, lawmakers, and citizens who lack the access or domain expertise needed to use data analytics tools. Potential consequences of this situation include reduced citizen governance and involvement in climate policies and an overall surge in analytics costs, which makes the analyses less accessible to the public. In this work, we use a Latent Dirichlet Allocation-based pipeline for the automatic summarization and analysis of 10 years of national energy and climate plans (NECPs) for the period from 2021 to 2030, established by 27 Member States of the European Union. We focus on analyzing policy framing, the language used to describe specific issues, to detect essential nuances in the way governments frame their climate policies and achieve climate goals. The methods leverage topic modeling and clustering for the comparative analysis of policy documents across different countries. This allows for easier integration into potential user-friendly applications for the development of theories and processes of climate policy, which would in turn lead to better citizen governance and engagement over climate policies and public policy research.

Diversity-Promoting Ensemble for Medical Image Segmentation

Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Andreea-Iuliana Miron

Categories: Computer Vision | Machine Learning

2022-10-22

Medical image segmentation is an actively studied task in medical imaging, where the precision of the annotations is of utmost importance for accurate diagnosis and treatment. In recent years, the task has been approached with various deep learning systems, with U-Net among the most popular models. In this work, we propose a novel strategy to generate ensembles of different architectures for medical image segmentation by leveraging the diversity (decorrelation) of the models forming the ensemble. More specifically, we use the Dice score among model pairs to estimate the correlation between the outputs of the two models forming each pair. To promote diversity, we select models with low Dice scores among each other. We carry out gastro-intestinal tract image segmentation experiments to compare our diversity-promoting ensemble (DiPE) with another ensemble creation strategy based on selecting the top-scoring U-Net models. Our empirical results show that DiPE surpasses both the individual models and the ensemble creation strategy based on selecting the top-scoring models.
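
The selection idea described in this abstract (pairwise Dice scores as a similarity measure, preferring models with low mutual Dice) can be sketched as follows. This is only an illustrative sketch: the abstract does not specify the selection procedure, so the greedy rule in `select_diverse`, the function names, the model names, and the toy masks are all assumptions made here, not the authors' method.

```python
from itertools import combinations


def dice(a, b):
    """Dice score between two binary masks given as flat 0/1 sequences."""
    intersection = sum(p and q for p, q in zip(a, b))
    total = sum(a) + sum(b)
    # Two empty masks are treated as identical (a convention assumed here).
    return 1.0 if total == 0 else 2.0 * intersection / total


def select_diverse(preds, k):
    """Hypothetical greedy selection of k >= 2 models with low mutual Dice.

    Start from the least similar pair, then repeatedly add the model whose
    mean Dice to the already chosen models is lowest.
    """
    names = list(preds)
    chosen = list(
        min(combinations(names, 2), key=lambda p: dice(preds[p[0]], preds[p[1]]))
    )
    while len(chosen) < k:
        rest = [n for n in names if n not in chosen]
        chosen.append(
            min(
                rest,
                key=lambda n: sum(dice(preds[n], preds[c]) for c in chosen)
                / len(chosen),
            )
        )
    return chosen


# Toy predictions on a six-pixel image; the model names are made up.
preds = {
    "unet_a": [1, 1, 1, 0, 0, 0],
    "unet_b": [1, 1, 1, 0, 0, 0],
    "model_c": [0, 1, 1, 1, 0, 0],
    "model_d": [1, 1, 0, 0, 0, 0],
}

pair = select_diverse(preds, 2)
```

In practice the Dice scores would be computed on held-out predictions, and the chosen models' outputs would then be combined (for example by averaging or voting); that combination step is not shown here.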

Active Few-Shot Classification: a New Paradigm for Data-Scarce Learning Settings

Aymane Abdali, Vincent Gripon, Lucas Drumetz, Bartosz Boguslawski

Categories: Machine Learning

2022-09-23

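We consider a novel formulation of the problem of Active Few-Shot Classification (AFSC), where the objective is to classify a small, initially unlabeled dataset given a very restricted labeling budget. This problem can be seen as a rival paradigm to classical Transductive Few-Shot Classification (TFSC), as both approaches are applied in similar conditions. We first propose a methodology that combines statistical inference with an original two-tier active learning strategy that fits well into this framework. We then adapt several standard vision benchmarks from the field of TFSC. Our experiments show that the potential benefits of AFSC can be substantial, with gains in average weighted accuracy of up to 10% compared to state-of-the-art TFSC methods for the same labeling budget. We believe this new paradigm could lead to new developments and standards in data-scarce learning settings.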

Deep learning automates bidimensional and volumetric tumor burden measurement from MRI in pre- and post-operative glioblastoma patients

Jakub Nalepa, Krzysztof Kotowski, Bartosz Machura, Szymon Adamski, Oskar Bozek, Bartosz Eksner, Bartosz Kokoszka, Tomasz Pekala, Mateusz Radom, Marek Strzelczak

Categories: Computer Vision

2022-09-03

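Tumor burden assessment by magnetic resonance imaging (MRI) is central to the evaluation of treatment response for glioblastoma. This assessment is complex to perform and is associated with high variability due to the high heterogeneity and complexity of the disease. In this work, we tackle this issue and propose a deep learning pipeline for the fully automated end-to-end analysis of glioblastoma patients. Our approach first simultaneously identifies tumor sub-regions, including the enhancing tumor, peritumoral edema, and surgical cavity, and then calculates the volumetric and bidimensional measurements that follow the current Response Assessment in Neuro-Oncology (RANO) criteria. In addition, we introduce a rigorous manual annotation process, which human experts followed to delineate the tumor sub-regions and to capture their segmentation confidences, which are later used while training the deep learning models. Our extensive experimental study was performed over 760 pre-operative and 504 post-operative patients with glioma, obtained from a public database (acquired at 19 sites in years 2021-2020) and from a clinical treatment trial (47 and 69 sites for pre-/post-operative patients, 2009-2011), and was backed up with thorough quantitative, qualitative, and statistical analysis. The results reveal that our pipeline performs accurate segmentation of pre- and post-operative MRIs in a fraction of the manual delineation time (up to 20 times faster than humans). The bidimensional and volumetric measurements were in strong agreement with expert radiologists, and we show that RANO measurements are not always sufficient to quantify tumor burden.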

Entity Graph Extraction from Legal Acts -- a Prototype for a Use Case in Policy Design Analysis

Anna Wróblewska, Bartosz Pieliński, Karolina Seweryn, Karol Saputa, Aleksandra Wichrowska, Sylwia Sysko-Romańczuk, Hanna Schreiber

Categories: Natural Language Processing

2022-09-02

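This paper presents research on a prototype developed to serve the quantitative study of public policy design. This sub-discipline of political science focuses on identifying actors, the relations between them, and the tools at their disposal in health, environmental, economic, and other policies. Our system aims to automate the process of gathering legal documents, annotating them with Institutional Grammar, and using hypergraphs to analyze the inter-relations between crucial entities. Our system is tested against the 2003 UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage, a legal document that regulates essential aspects of international relations for securing cultural heritage.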

ProPaLL: Probabilistic Partial Label Learning

Łukasz Struski, Jacek Tabor, Bartosz Zieliński

Categories: Machine Learning | Artificial Intelligence

2022-08-21

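Partial label learning is a type of weakly supervised learning in which each training instance corresponds to a set of candidate labels, only one of which is correct. In this paper, we introduce ProPaLL, a novel probabilistic approach to this problem, which has at least three advantages over existing approaches: it simplifies the training process, improves performance, and can be applied to any deep architecture. Experiments conducted on artificial and real-world datasets indicate that ProPaLL outperforms the existing approaches.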

AG2U -- Autonomous Grading Under Uncertainties

Yakov Miron, Yuval Goldfracht, Chana Ross, Dotan Di Castro, Itzik Klein

Categories: Robotics

2022-08-04

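Surface grading, the process of leveling an uneven area containing pre-dumped sand piles, is an important task in the construction site pipeline. This labor-intensive process is often carried out by a dozer, a key machinery tool at any construction site. Current attempts to automate surface grading assume perfect localization. However, in real-world scenarios this assumption fails, because agents have imperfect perception, which leads to degraded performance. In this work, we address the problem of autonomous grading under uncertainties. First, we implement a simulation and a scaled real-world prototype environment to enable rapid policy exploration and evaluation in this setting. Second, we formalize the problem as a partially observable Markov decision process and train an agent capable of handling such uncertainties. We show, through rigorous experiments, that an agent trained under perfect localization suffers degraded performance when presented with localization uncertainties. However, an agent trained using our method develops a more robust policy for addressing such errors and consequently exhibits better grading performance.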

DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation

Vladimir Tchuiev, Yakov Miron, Dotan Di-Castro

Categories: Robotics

2022-07-19

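Object manipulation in cluttered scenes is a difficult and important problem in robotics. To manipulate objects efficiently, it is crucial to understand their surroundings, especially when objects are stacked one on top of another, which prevents effective grasping. We present DUQIM-Net, a decision-making approach for object manipulation in a setting of stacked objects. In DUQIM-Net, the hierarchical stacking relationship is assessed using Adj-Net, a model that leverages existing Transformer encoder-decoder object detectors by adding an adjacency head. The output of this head probabilistically infers the underlying hierarchical structure of the objects in the scene. We use the properties of the adjacency matrix in DUQIM-Net to perform decision making and assist with object-grasping tasks. Our experimental results show that Adj-Net surpasses the state of the art in object-relationship inference on the Visual Manipulation Relationship Dataset (VMRD), and that DUQIM-Net outperforms comparable approaches in bin-clearing tasks.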
